Improvements in RWTH LVCSR evaluation systems for Polish, Portuguese, English, Urdu, and Arabic

Authors

  • M. Ali Basha Shaik
  • Zoltán Tüske
  • Muhammad Ali Tahir
  • Markus Nußbaum-Thom
  • Ralf Schlüter
  • Hermann Ney
Abstract

In this work, Portuguese, Polish, English, Urdu, and Arabic automatic speech recognition evaluation systems developed by RWTH Aachen University are presented. Our large vocabulary continuous speech recognition (LVCSR) systems focus on various domains such as broadcast news, spontaneous speech, and podcasts. All of these systems except Urdu are used for the Euronews and Skynews evaluations as part of the EU-Bridge project. Our previously developed LVCSR systems for these languages were improved using different techniques. Significant improvements are obtained using multilingual tandem and hybrid approaches, minimum phone error training, lexical adaptation, open-vocabulary long short-term memory language models, maximum entropy language models, and confusion-network-based system combination.


Related resources

RWTH LVCSR systems for Quaero and EU-Bridge: German, Polish, Spanish and Portuguese

In this paper, German, Polish, Spanish, and Portuguese large vocabulary continuous speech recognition (LVCSR) systems developed by RWTH Aachen University are presented. All of these systems are used for the Quaero and EU-Bridge project evaluations. The LVCSR systems developed for these competitive evaluations focus on various domains like broadcas...


The RWTH Aachen German and English LVCSR systems for IWSLT-2013

In this paper, German and English large vocabulary continuous speech recognition (LVCSR) systems developed by the RWTH Aachen University for the IWSLT-2013 evaluation campaign are presented. Good improvements are obtained with state-of-the-art monolingual and multilingual bottleneck features. In addition, an open vocabulary approach using morphemic sub-lexical units is investigated along with t...


Recent improvements of the RWTH GALE Mandarin LVCSR system

This paper describes the current improvements of the RWTH Mandarin LVCSR system. We introduce a new reduced toneme set developed at RWTH. We are using different toneme sets and pronunciation lexica. For the purpose of discriminative training we will show a fast way to transform word lattices between systems using different toneme sets and pronunciation lexica. In addition to various acoustic fr...


The RWTH Aachen machine translation system for IWSLT 2011

In this paper, the statistical machine translation (SMT) systems of RWTH Aachen University developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2011 are presented. We participated in the MT (English-French, Arabic-English, Chinese-English) and SLT (English-French) tracks. Both hierarchical and phrase-based SMT decoders are applied. A number of ...


The RWTH Aachen speech recognition and machine translation system for IWSLT 2012

In this paper, the automatic speech recognition (ASR) and statistical machine translation (SMT) systems of RWTH Aachen University developed for the evaluation campaign of the International Workshop on Spoken Language Translation (IWSLT) 2012 are presented. We participated in the ASR (English), MT (English-French, Arabic-English, Chinese-English, German-English) and SLT (English-French) tracks. F...



Publication date: 2015